Abstract

Objectives To investigate the effectiveness of non-benzodiazepine 
hypnotics (Z drugs) and associated placebo responses in adults and to 
evaluate potential moderators of effectiveness in a dataset used to 
approve these drugs. 

Design Systematic review and meta-analysis. 

Data source US Food and Drug Administration (FDA). 

Study selection Randomised double blind parallel placebo controlled 
trials of currently approved Z drugs (eszopiclone, zaleplon, and
zolpidem).

Data extraction Change score from baseline to post-test for drug and 
placebo groups; drug efficacy analysed as the difference of both change 
scores. Weighted raw and standardised mean differences with their 
confidence intervals under random effects assumptions for 
polysomnographic and subjective sleep latency, as primary outcomes. 
Secondary outcomes included waking after sleep onset, number of 
awakenings, total sleep time, sleep efficiency, and subjective sleep 
quality. Weighted least square regression analysis was used to explain 
heterogeneity of drug effects. 

Data synthesis 13 studies containing 65 separate drug-placebo
comparisons by type of outcome, type of drug, and dose were included. 
Studies included 4378 participants from different countries and varying 
drug doses, lengths of treatment, and study years. Z drugs showed 
significant, albeit small, improvements (reductions) in our primary 
outcomes: polysomnographic sleep latency (weighted standardised
mean difference -0.36, 95% confidence interval -0.57 to -0.16) and subjective
sleep latency (-0.33, -0.62 to -0.04) compared with placebo. Analyses
of weighted mean raw differences showed that Z drugs decreased
polysomnographic sleep latency by 22 minutes (-33 to -11 minutes)
compared with placebo. Although no significant effects were found in 
secondary outcomes, there were insufficient studies reporting these 
outcomes to allow firm conclusions. Moderator analyses indicated that 
sleep latency was more likely to be reduced in studies published earlier, 
with larger drug doses, with longer duration of treatment, with a greater 
proportion of younger and/or female patients, and with zolpidem.

Conclusion Compared with placebo, Z drugs produce slight 
improvements in subjective and polysomnographic sleep latency, 
especially with larger doses and regardless of type of drug. Although 
the drug effect and the placebo response were rather small and of 
questionable clinical importance, the two together produced a
reasonably large clinical response. 

Introduction 

Hypnotic drugs are often prescribed in primary care for
insomnia. Despite a reduction in prescribing of benzodiazepine
hypnotics in the past decade, hypnotic use and costs remain
high because of the introduction and increase in use of Z drugs,
a group of non-benzodiazepine hypnotic drugs (including
eszopiclone, zaleplon, and zolpidem), which act on the GABA
(γ aminobutyric acid) receptor and are used in the treatment of
insomnia. These are now the most commonly prescribed
hypnotic agents worldwide. Prescriptions exceed costs of $285m
(£178m, €221m) in the United States and £25m (€31m, $40m)
in the UK. Although widely prescribed, Z drugs are not without
risks. These include adverse cognitive effects (such as memory 
loss), psychomotor effects (such as falls, fractures, road traffic 
crashes), daytime fatigue, tolerance, addiction, and excess 
mortality, with no significant difference from benzodiazepines.
These established risks need to be weighed against the benefits. 

Previous meta-analyses of clinical trials of Z drugs have been
prone to publication bias, such as unavailability of unpublished
trials, selective or duplicate publication, and selective reporting
of results in constituent studies. An example of the distorting
effects of these publishing practices was shown in a study by
Mattila and colleagues. This study compared European Public
Assessment Reports of three drugs for insomnia to identify 
clinical trials that were performed between 1998 and 2007 for 
the purpose of registration of these drugs in the European Union. 
They found that the effect size of these drugs was 1.6 times 
larger when it was based on published data compared with the 
whole sample of studies, published and unpublished. They also
found "remarkable inconsistencies in the reporting of the 
secondary end points, methods, results and, especially safety." 
Different characteristics of included studies have not been 
examined as possible moderators of effects of Z drugs in 
previous meta-analyses. 

One way of reducing the problem of publication bias is to 
analyse the effect of drugs that have been approved by 
governmental agencies with data derived from regulatory 
submissions. Drug companies are required to provide
information on all sponsored trials, published or not, when
applying for new drug approvals. Hence, the US Food and
Drug Administration (FDA) files contain a complete dataset of 
published and unpublished trials up to the date of drug approval. 
We therefore undertook a meta-analysis of randomised placebo 
controlled parallel group studies of clinical effectiveness of Z 
drug hypnotics for insomnia in adults using only data provided 
to the FDA for drug approval. 

Another concern with studies of hypnotics is the magnitude of 
the placebo response. We have considered the distinction 
between drug and placebo responses and drug and placebo 
effects. A drug response is the change that occurs after
administration of the drug. The effect of the drug is that portion 
of the response that is due to the drug's chemical composition; 
it is the difference between the drug response and the response 
to placebo. A similar distinction can be made between placebo 
responses and placebo effects. The placebo response is the 
change that occurs after administration of a placebo. It includes 
such factors as improvement because of the natural course of 
the condition and regression toward the mean, as well as the 
placebo effect itself. 
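
The distinction between responses and effects can be written compactly. The notation below is illustrative only and is not part of the original analysis: R denotes an observed response (change from baseline), E an effect, and the additive form of the second line is a simplifying assumption.

```latex
\begin{align}
E_{\text{drug}} &= R_{\text{drug}} - R_{\text{placebo}} \\
R_{\text{placebo}} &= E_{\text{placebo}}
  + \Delta_{\text{natural course}}
  + \Delta_{\text{regression to the mean}}
\end{align}
```

Under this reading, a trial with only drug and placebo arms identifies the drug effect and the placebo response, but not the placebo effect itself.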

Previous studies have shown significant improvements in 
placebo arms in placebo controlled trials of hypnotic drugs.
Assessment of the magnitude of the placebo effect is important 
for understanding drug-placebo differences and their 
implications for clinical practice. For example, a small 
drug-placebo difference might lead to different treatment options 
if the drug and placebo are both effective rather than if neither
is effective. Because change in the absence of placebo
administration is rarely assessed in randomised controlled trials 
(and was not assessed in the trials contained in the FDA files), 
we could not assess the placebo effect. Therefore, we assessed 
changes in placebo groups, as well as those in drug groups, thus 
allowing us to establish the magnitude and significance of 
placebo responses, drug effects, and other variables that can 
moderate these outcomes. 



Methods 
Data source 

For this systematic review we adhered to PRISMA 
guidelines. We obtained data on all currently approved
(non-benzodiazepine) Z drugs (eszopiclone, zaleplon, and
zolpidem) from the FDA website (see appendix 1).

Study selection 

The criteria for inclusion were randomised double blind 
controlled trials, recruitment of adults with primary insomnia 
(transient or chronic), an intervention comparing a Z drug with 
a placebo control, submission to the FDA before approval, 
sponsored by the manufacturer, and studies from any country 
or reported in any language (although we found only reports in 
English). Studies were excluded if they were crossover designs, 
included healthy patients with normal sleep, were single night 
studies with induced insomnia, or did not report any inference 
test or enough descriptive information (for instance, percentages 
or means and a variability measure for both groups and/or both 
time measures) as included studies were too heterogeneous and 
not large enough to estimate the missing information to calculate 
an effect size. We excluded crossover trials because of problems 
associated with reactivity, learning, carry over effects, and 
failure of blinding. Blinding failure is more likely with crossover 
studies, leading to an enhanced placebo effect in the drug 
treatment arm, thereby increasing the likelihood of a false 
positive (type I) error. We did not include post-approval trials 
in our analysis because it is not possible to obtain access to all 
unpublished data for those trials. 

Data extraction and quality assessment 

Two independent trained raters extracted information related 
to the study with high inter-rater reliability: mean Cohen's κ
0.90 for categorical variables, and mean intraclass correlation
r=0.92 for continuous variables. Because of the nature of the 
FDA data, extractors were blind to researchers and institutions. 

Methodological quality was assessed with the Jadad scale
as adapted by Miller and colleagues (see appendix 2). For each
study, we extracted statistical data for drug and placebo. We 
also coded sample and study characteristics and included 
dimensions such as study identifier, year of publication, 
location/s of study (country and number of sites), and study 
duration. Data were extracted for two primary outcomes and 
eight secondary outcomes. The primary outcomes were 
polysomnographic and subjective sleep latency. The secondary 
outcomes were subjective and polysomnographic total sleep 
time, subjective and polysomnographic number of awakenings, 
subjective sleep quality, sleep efficiency, and subjective and 
polysomnographic time awake after sleep onset. Measured 
characteristics of participants included proportion of women, 
age, and sample type (outpatients, elderly, etc). Design 
characteristics included design type, recruitment method, 
intervention drug(s), treatment duration, and statistics reported. 
None of the trials reported race or ethnicity. 

Data synthesis and analysis 

For both measures of sleep latency (polysomnographic and 
subjective) and the eight other sleep related outcomes, we 
calculated effect sizes as the mean difference between pre-test 
and post-test divided by the standard deviation (SD) of the 
pre-test value for each group separately (that is, repeated
measures effect sizes, correcting for sample size bias). The
standardised mean change in the placebo group was subtracted
from that of the intervention group to evaluate the drug effect
with respect to the placebo effect for each comparison (that is, 
effect size between groups adjusted for baseline). We calculated 
multiple effect sizes if the study reported more than one drug 
group or multiple outcomes. The latter were analysed separately 
to investigate main effects and moderators. For multiple dose 
studies, with the same or different drug but with the same control 
group, we controlled multi-treatment dependence by estimating 
the covariance among them to analyse the effect of different
drug and dose combinations. The sign of the effect size was set 
so that negative values signified a decrease of waking after sleep 
onset, sleep latency, and number of awakenings (all both 
polysomnographic and subjective). The sign of the effect size 
was positive for increases in polysomnographic and subjective 
total sleep time, polysomnographic sleep efficiency, and sleep 
quality.
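
The effect size calculation described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names and the example numbers are hypothetical, and the small sample correction factor shown (1 - 3/(4df - 1)) is one common choice that we assume here.

```python
def bias_correction(df: int) -> float:
    """Small sample correction factor (assumed form: 1 - 3 / (4*df - 1))."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def standardised_mean_change(pre_mean, post_mean, pre_sd, n):
    """Repeated measures effect size: (post - pre) / SD of pre-test, bias corrected."""
    d = (post_mean - pre_mean) / pre_sd
    return bias_correction(n - 1) * d


def drug_effect(drug, placebo):
    """Between group effect adjusted for baseline: drug change minus placebo change."""
    return standardised_mean_change(**drug) - standardised_mean_change(**placebo)


# Hypothetical sleep latency data in minutes (not taken from any included trial).
drug = dict(pre_mean=60.0, post_mean=30.0, pre_sd=40.0, n=50)
placebo = dict(pre_mean=60.0, post_mean=45.0, pre_sd=40.0, n=50)

effect = drug_effect(drug, placebo)
print(f"Drug effect (standardised): {effect:.3f}")
```

A negative value indicates a larger reduction in sleep latency with the drug than with placebo, matching the sign convention used in the text.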

We obtained repeated measures effects sizes for each group for 
those comparisons reporting means and SDs and used medians 
and interquartile ranges as the best approximation for those 
studies missing mean and SDs for drug and placebo. Sensitivity 
analysis was undertaken by comparing the main results, with 
and without those comparisons where median and interquartile 
range were used to obtain a standardised mean difference. 
Transformations were conducted to obtain effect sizes between 
groups for those cases where F test or P values were reported.
As 63% (41) of the comparisons did not report SDs of each 
group and those provided were largely heterogeneous, we have 
reported repeated measures results for only six studies (24 
comparisons) with the most complete statistical data; two studies 
were not included in the final analysis as they did not report 
any inference test or variability measure either in their repeated 
or two groups measures. We have reported effect sizes in their 
raw metric for the same comparisons in parallel to facilitate
clinical interpretation. 

We examined the effect sizes with random effects models
for weighted effect sizes and publication bias. Random effects 
models are more robustly generalisable as they assume 
variability not only within studies but also between studies, a 
relevant assumption when studies from different populations 
are integrated to account for sampling error and population 
variance. Moderation patterns were examined under mixed and 
fixed effects assumptions, but we have reported results only 
under the latter assumptions because of the lack of power to 
show any significant pattern under mixed effects models. The
homogeneity statistic, Q, determined whether each set of 
weighted mean effect sizes shared a common parametric effect 
size: a significant Q indicates a lack of homogeneity. To assess 
not only significance of the heterogeneity but also its size, we 
calculated the I² index and its corresponding 95% confidence
intervals to determine and compare across outcomes the extent
of the heterogeneity. I² varies between 0 (homogeneous) and
100% (non-homogeneous), and if the confidence interval around
I² includes zero, the set of effect sizes is considered
homogeneous. We investigated possible asymmetries in the 
distribution of the effect sizes, which could indicate reporting 
bias, using the trim and fill technique, Begg's strategy, and
Egger's test. We analysed the total Jadad score as well as
individual item scores to detect any possible bias effect on the 
overall results. Finally, we conducted sensitivity analyses with 
effect sizes with more than 2 SD from the average effect size. 
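
As an illustration of the quantities described in this paragraph, the sketch below pools effect sizes under a random effects model and computes Q and I². It assumes the DerSimonian-Laird estimator of between study variance, which the text does not specify; the function name and the input values are hypothetical.

```python
import math


def random_effects(effects, variances):
    """Pool effect sizes with inverse variance weights (DerSimonian-Laird assumed)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Homogeneity statistic Q: weighted squared deviations from the fixed effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between study variance (tau squared), truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau squared to each within study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I squared: percentage of total variability attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "tau2": tau2,
        "I2": i2,
    }


# Hypothetical standardised mean differences and their variances.
result = random_effects([-0.8, -0.2], [0.01, 0.01])
print(result)
```

Confidence intervals for I², as reported in the text, would need an additional step that is not shown here.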

Moderator analyses were conducted for the main outcomes, 
polysomnographic and subjective sleep latency. To explain 
possible moderation of the variability of the overall effect sizes, 
we examined the relation between sample, methodological, or 
condition characteristics and magnitude of effect using
modified weighted least squares bivariate regression analyses
with weights equivalent to the inverse of the variance for each
effect size. Because doses of different drugs are not
equivalent, we also tested the drug by dose interaction. We used 
total score on the methodological quality scale as a moderator 
to analyse possible interaction with the final weighted effect 
sizes and have presented any significant pattern for either sleep 
latency or its subjective measure. 
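
A bivariate moderator analysis of this kind can be sketched as below. This is an illustration under stated assumptions rather than the analysis code: we assume that "modified" refers to taking the standard error of the slope directly from the inverse variance weights (fixed effects assumptions) instead of scaling it by the residual mean square, and the dose values and effect sizes are hypothetical.

```python
import math


def weighted_meta_regression(x, effects, variances):
    """Fixed effects bivariate regression of effect size on one moderator."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Weighted sums of squares and cross products around the weighted means.
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, effects))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Assumed modification: SE comes from the known sampling variances only.
    se_slope = math.sqrt(1.0 / sxx)
    return {"slope": slope, "intercept": intercept,
            "se_slope": se_slope, "z": slope / se_slope}


# Hypothetical moderator (dose in mg) and standardised effect sizes.
doses = [5.0, 10.0, 15.0]
effects = [-0.2, -0.4, -0.6]
variances = [0.02, 0.02, 0.02]

fit = weighted_meta_regression(doses, effects, variances)
print(fit)
```

A negative slope in this sketch corresponds to greater reductions in sleep latency at larger doses.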

Results 

Description of studies 

In the data obtained from the FDA website, we identified 13 
clinical trials comprising 4378 participants that examined 65 
separate drug-placebo comparisons by type of outcome, type 
of drug, and dose and that met the inclusion criteria. Figure 1 
shows the trial flow. Table 1 and appendix 3 provide
descriptive features of the studies. Methodological quality of 
the studies ranged from 13 to 21 on the Jadad scale (mean 15.63, 
SD 1.8). Publication year and quality score were not 
significantly correlated (r=0.34, P=0.28). 

Studies were conducted in North America (eight studies), North
America and Europe (one study), South America (one study),
or Australia (one study), with one study conducted entirely in 
Europe, and another study without location information. The 
mean duration of studies was 33.9 days (SD 33.3, range 14-180 
days). Of the 4378 participants sampled, 61% were women, 
61% were aged under 45, and the mean age was 49.6 (SD 13.3;
range 38-72) years. 

All 13 studies included comparisons of at least one of our
primary outcomes. Ten studies (22 comparisons) assessed
polysomnographic sleep latency and seven (11 comparisons)
assessed subjective sleep latency. The eight remaining secondary 
outcomes appeared in fewer studies: four studies (seven 
comparisons) assessed subjective total sleep time, two (two 
comparisons) assessed total polysomnographic sleep time, four 
(six comparisons) assessed subjective number of awakenings, 
three (four comparisons) assessed polysomnographic number 
of awakenings, two (four comparisons) assessed subjective sleep 
quality, three (five comparisons) assessed sleep efficiency, three 
(three comparisons) assessed polysomnographic waking after 
sleep onset, and one (one comparison) assessed subjective 
waking after sleep onset. 

Zolpidem was the most commonly prescribed drug (eight studies);
eszopiclone and zaleplon were assessed in three studies each
(one study included both zolpidem and zaleplon). Zolpidem
was prescribed in eight studies (15 comparisons) measuring
polysomnographic sleep latency, zaleplon in three studies (six
comparisons), and eszopiclone in only one study (one
comparison). Only zolpidem and eszopiclone were used in
studies measuring subjective sleep latency, in five (eight 
comparisons) and two (three comparisons) studies, respectively. 

Quantitative analyses 

For our primary outcomes, analyses of standardised effect sizes 
showed significant but small to medium differences in 
polysomnographic (weighted standardised mean difference 
-0.36, 95% confidence interval -0.57 to -0.16) and subjective
sleep latency (-0.33, -0.62 to -0.04) for treatment versus 
control. There were significant effect sizes for the primary 
outcome (sleep latency) within groups separately for both 
placebo (-0.39, -0.54 to -0.23 (for polysomnographic); -0.33, 
-0.63 to -0.03 (for subjective)) and drug (-0.93, -1.32 to -0.54 
(polysomnographic); -0.67, -1.30 to -0.03 (subjective)). 
Analyses of weighted mean raw differences indicated that drugs
decreased sleep latency by 22 minutes (-33 to -11 minutes). 

Tables 2 and 3 show standardised and raw effect sizes, 
respectively. Figures 2 and 3 show forest plots for
polysomnographic and subjective sleep latency, respectively.
Analysis of secondary study outcomes showed no significant 
drug effect. The lack of difference between groups for other 
sleep measures coupled with the fact that few reports included 
them meant there was insufficient evidence to show efficacy 
on these measures. 

There was no evidence of asymmetry of the distribution of the 
effect sizes for sleep latency by the trim and fill technique, 
Begg's test (P=0.88 (polysomnographic); P=0.22 (subjective)),
or Egger's test (P=0.34 (polysomnographic); P=0.77
(subjective)), which suggests that these results are not 
significantly affected by publication bias. One study was an 
outlier (eszopiclone study No 190-047, table 1 and appendix
3), with a large pooled effect size for sleep latency 0.46 (0.24 
to 0.69) and >2 SD from the overall weighted effect size. When 
we excluded this study, the pooled effect size for sleep latency 
was -0.54 (-0.91 to -0.15). Sensitivity analysis showed no 
significant differences in overall efficacy (the overall effect size 
without the outlier was still significant, with a slightly lower 
reduction of subjective sleep latency) and the same patterns for 
the moderator results adjusted for the two outlier comparisons
provided by this study. 

Every item was evaluated through bivariate weighted regression 
analysis under fixed and random effects assumptions to critically 
and robustly appraise any included study for risk of bias in 
attributing outcomes to the intervention and their possible effect 
on the overall efficacy, but none of the results was significant. 
Therefore, there was no evidence of any interaction between 
quality/risk of bias in the included studies and the final results. 

Moderator effects on sleep latency 

The main outcomes, polysomnographic and subjective sleep 
latency, were the only measures with sufficient cases to permit 
detailed models for moderator analyses (table 4). Sleep latency
was more likely to be reduced in studies published earlier, with
larger drug doses, longer treatment duration, and samples that
included a greater proportion of younger patients and/or female
patients (table 4). Polysomnographic and subjective sleep
latency were reduced when larger doses were used, regardless
of type of drug. The interaction of dose by type of drug was not
significant, and all drugs (zolpidem and zaleplon for
polysomnographic and subjective sleep latency and eszopiclone
and zolpidem for subjective sleep latency, the latter being
significantly more effective in this particular outcome) showed
a pattern of greater reductions in sleep latency with larger doses.
Subjective sleep latency was more likely to be reduced in studies
published earlier, or with greater numbers of younger patients
or women included in the sample, and with zolpidem. These
patterns were obtained under fixed effect meta-regression models 
and these held under mixed effects assumptions. 

Discussion 
Main findings 

In this meta-analysis of Z drugs using data published on the 
FDA website, which are less likely to be affected by selection 
or reporting bias, we found significant reductions in 
polysomnographic and subjective sleep latency in both drug 
and placebo groups. The difference between drug and placebo 
was 22 minutes for polysomnographic sleep latency and seven
minutes for subjective sleep latency. Although these reductions
in sleep latency might have benefits, albeit short term, for quality
of life, the effect sizes corresponding to these differences were
-0.36 and -0.33, both of which are conventionally considered
to be small effects, and well below the criterion for clinical
significance (0.50) suggested by the National Institute for Health
and Clinical Excellence (NICE) in their guidelines for the
treatment of depression.

There were insufficient data for other drug effect end points to 
allow a valid analysis. The large heterogeneity in sleep latency 
outcomes was mainly explained by larger doses needed to obtain 
a greater drug than placebo effect. Z drugs were more likely to 
be effective in reducing sleep latency in studies published earlier, 
those including more younger and/or female patients, and those 
using zolpidem. Significant placebo responses were present in
polysomnographic and subjective sleep latency. There have
been several previous meta-analyses of published data on Z
drugs, although none included moderator analyses and all
acknowledged publication bias.

Strengths and weaknesses 

As in previous studies, we found that data submitted for 
licensing enabled detailed investigation of drug efficacy.
We included sponsored studies submitted to the FDA but did 
not assess whether they were subsequently published. Studies 
submitted to the FDA are required to report all data so are less 
likely to be affected by reporting bias. 

Studies were subjected to the same methodological scrutiny and 
analytical rigour as meta-analyses of published studies. As in 
other meta-analyses, we did not include studies that did not 
report enough statistical data to calculate an effect size. Because 
of the small number of reports for some outcomes, and the 
heterogeneity of statistical data reported, we could not compare 
some studies directly or robustly impute missing data. There 
was insufficient information about sample setting characteristics, 
drug side effects, and other factors that might have explained 
heterogeneity to fully account for these. The entry criteria for 
studies varied, with some studies focusing just on sleep latency, 
particularly for shorter acting drugs such as zaleplon. This could
have affected the capacity of some studies to identify effects 
other than on sleep latency. All the drugs are licensed for 
insomnia, and patients presenting for treatment have a range of 
symptoms, not just sleep latency, for which these drugs are 
commonly prescribed in general practice. 

Another weakness in the present analysis is that all the trials 
were industry sponsored. Industry sponsorship has been shown 
to enhance the outcome of clinical trials. Thus, although we
were able to include published and unpublished studies, at least 
for the reports used to approve these drugs, we could not avoid 
sponsorship bias, and our results might therefore overestimate 
the drug effect. Unfortunately, eliminating both sources of bias 
simultaneously is difficult, if not impossible. Although clinical 
trials now need to be registered in advance to be published in 
major medical journals, there is no requirement that the results
be submitted for publication, and many failed clinical trials or
clinical trials with negative results go unpublished.
Furthermore, although many clinical trials are subject to
mandatory reporting of results to the FDA, most are not, and
for those that are, as many as 78% fail to comply with this
requirement. Because sponsorship bias is in the direction of
greater effects for industry sponsored trials, our results might 
overestimate the effects of Z drug hypnotics for treating adult 
insomnia. 

We found evidence of a significant placebo response for sleep
latency. McCall and colleagues undertook a meta-analysis of 
sleep changes associated with placebo in published hypnotic 
clinical trials and found a clinically important and statistically 
significant placebo response for subjective sleep latency and 
total sleep time. Belanger and colleagues undertook a
meta-analysis of sleep changes in control groups of 34 hypnotic 
drug studies in which 23 used a pharmacological placebo, four 
a psychological placebo, and seven a waiting list. They found 
significant pre-post changes in the pharmacological placebo 
group on several sleep outcomes, both objectively and
subjectively measured, suggesting that sleep measures might
change significantly in response to a pharmacological placebo.

The response to placebo is more than just the placebo effect. 
Just as the effect of a drug is estimated by the difference between 
the response to the drug and the response to a placebo, the 
placebo effect would be the difference between the placebo 
response and changes occurring without administration of a 
placebo. Belanger and colleagues assessed the response to 
placebo hypnotic drugs and compared it with sleep changes 
among patients placed on a waiting list. Compared with those
on a waiting list, there were significantly greater improvements
in subjective sleep onset latency (19.55 min v 2.43 min),
subjective total sleep time (31.13 min v 7.30 min), and objective
total sleep time (18.27 min v 10.34 min) in the placebo group.
These data were based on comparisons between studies rather 
than comparisons within studies, and none of the trials in the 
FDA database included waiting list controls. Nevertheless, the 
results of Belanger and colleagues suggest that the placebo 
response observed in our meta-analysis was largely caused by 
a genuine placebo effect. Future clinical trials including both 
placebo and untreated (natural course) controls would be useful, 
as well as combining the results of studies using network 
meta-analysis. 

Meaning of the study 

The response to a medical treatment consists of two components: 
a true drug effect and a non-specific placebo response, which 
includes the placebo effect, regression toward the mean, and 
improvement because of the natural course of the condition. 
For that reason, it is useful, both for current clinical practice 
and for future treatment development, to know the effect sizes 
for the placebo group as well as for the drug group. For
example, finding that both placebos and drugs are effective but 
that the drug is more effective than the placebo, suggests that 
placebo characteristics can be used to amplify effectiveness of 
a drug. Conversely, finding improvement only in drug arms 
indicates that the placebo effect is not an important component 
of treatment, whereas finding that both are equally effective, 
compared with waiting list controls, suggests that non-specific 
aspects of patient care might be having positive effects. 

We found that both the drug effect and the placebo response 
were small and of questionable clinical importance. The two 
put together, however, lead to a reasonably large clinical 
response. Although the drug-placebo difference in objectively 
measured sleep latency was only 22 minutes, the response to 
the Z drugs, including both drug effect and the placebo effect 
components, was 42 minutes. Similarly, the effect size for the 
drug response was -0.93 and that for the placebo response was 
-0.39, accounting for about half of the drug response. 
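
The arithmetic behind this comparison can be laid out explicitly. The sketch below uses only the figures quoted in the text for polysomnographic sleep latency; the implied placebo response in minutes is our inference from subtracting the drug-placebo difference from the drug response, and assumes the two components are additive.

```python
# Values reported in the text for polysomnographic sleep latency.
drug_response_min = 42.0     # response to Z drugs (minutes)
drug_effect_min = 22.0       # drug-placebo difference (minutes)
drug_response_es = 0.93      # magnitude of standardised drug response
placebo_response_es = 0.39   # magnitude of standardised placebo response

# Assumption: drug response = drug effect + placebo response (additive).
implied_placebo_response_min = drug_response_min - drug_effect_min

# Share of the drug response matched by the placebo response, on each scale.
placebo_share_minutes = implied_placebo_response_min / drug_response_min
placebo_share_es = placebo_response_es / drug_response_es

print(f"Implied placebo response: {implied_placebo_response_min:.0f} minutes")
print(f"Placebo share of drug response (minutes): {placebo_share_minutes:.0%}")
print(f"Placebo share of drug response (effect sizes): {placebo_share_es:.0%}")
```

On either scale the placebo response amounts to somewhat under half of the drug response, consistent with the description above.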

Insomnia is a symptom defined disorder characterised by distress 
about perceived poor sleep or lack of sleep. Hence, subjective 
sleep latency might be as important as objective sleep latency 
in understanding the benefits of treatments for this condition. 



The response to Z drugs was 25 minutes shorter for subjectively 
perceived sleep latency, whereas the response to placebo was 
an improvement of 19 minutes. Thus the benefit of Z drugs in
terms of subjectively perceived sleep latency was only seven
minutes and was not significant. However, this was based on 
only two comparisons. Effect sizes for subjective sleep latency 
were calculable for a larger number of trials and the 
drug-placebo difference (-0.33) was small but significant, with 
the placebo response again accounting for about half of the drug 
response. 

Taken together, these data suggest that the placebo response is 
a major contributor to the effectiveness of Z drugs. The 
remaining effect needs to be balanced against the harms 
associated with these drugs. The substantial proportion of the 
drug response accounted for by the placebo response indicates 
the importance of non-specific factors in the treatment of 
insomnia. As the placebo effect is a psychological phenomenon, 
these data suggest that increased attention should be directed at 
psychological interventions for insomnia. 

Unanswered questions and future research 

FDA data could also provide further opportunities for studying
adverse effects of Z drugs (particularly as larger
effect sizes were associated with higher drug doses), as well as
examining issues of publication and reporting delays and bias.
We did not look at adverse effects, which can pose significant
risks, leading to concerns about the widespread and sometimes
inappropriate use of these drugs.

Conclusion 

This study of FDA data shows that Z drugs improve objective 
and subjective sleep latency compared with placebo, particularly 
in younger and female patients. The size of this effect, however, 
is small and needs to be balanced with concerns about adverse 
effects, tolerance, and potential addiction. The placebo response 
accounted for about half of the drug response. This suggests 
that increased attention should be directed at psychological 
interventions for insomnia. 

Tables



Table 1 Characteristics of studies included in review of Z drugs (a more detailed version of this table is in appendix 3) 

| Study identifier, year, country (No of sites) | No | Women | Mean (SD) age (years) | Drug | Recruitment | Baseline PSG (subjective) sleep latency | Design | Patient type | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| [identifier and year illegible], US, Canada, Australia (site counts illegible) | 212 | [illegible] | [illegible] | Zolpidem MR 12.5 mg | Community | Intervention: 41.7 ([illegible]); placebo: [illegible] | Phase II multicentre randomised double blind placebo controlled parallel group | Outpatients | Wake after sleep onset PSG, sleep latency PSG, No of awakenings PSG, No of awakenings subjective, sleep latency subjective, total sleep time PSG |
| [identifier and year illegible], Argentina (5), Canada (1), France (4), Germany ([illegible]), Mexico (2), US (16) | [illegible] | [illegible] | [illegible] | Zolpidem MR 6.25 mg | Community | Intervention: 36.9 ([illegible]); placebo: [illegible] | Randomised multicentre double blind placebo controlled | Outpatients | Wake after sleep onset PSG, sleep latency PSG, No of awakenings PSG, No of awakenings subjective, sleep latency subjective, total sleep time PSG |
| [identifier and year illegible], US | [illegible] | [illegible] | [illegible] | Zolpidem 10 mg and 15 mg | Community | Intervention: zolpidem 10 mg: [illegible]; zolpidem 15 mg: 47.0 (61.0); placebo: 49.9 (70.4) | Double blind parallel group | Outpatients | Sleep latency PSG, sleep efficiency PSG, No of awakenings PSG, sleep latency subjective, total sleep time subjective, No of awakenings subjective, sleep quality subjective |
| [identifier and year illegible], US ([illegible]) | [illegible] | [illegible] | 40 | Zolpidem 10 mg and 15 mg | Community | Intervention: zolpidem 10 mg ([illegible]); zolpidem 15 mg (75.9); placebo (58.2) | Double blind parallel group | Outpatients | Sleep latency subjective, total sleep time subjective, No of awakenings subjective, sleep quality subjective |
| [identifier illegible], NR, NR | [illegible] | NR | NR | Zolpidem 10 mg and 15 mg | Community | Intervention: zolpidem 10 mg: [illegible]; zolpidem 15 mg: 47.0 (61.0); placebo: 49.9 (70.4) | Multicentre double blind randomised placebo controlled parallel group trial | Outpatients | Sleep latency PSG, sleep latency subjective, sleep efficiency PSG |
| 204-EU, 1997, Spain (4), France (3), Belgium (3), Netherlands (1) | 130 | NR | NR | Zaleplon 10 mg and 20 mg v zolpidem 10 mg | Community | Intervention: zaleplon 10 mg: 40.4; zaleplon 20 mg: 48.0; zolpidem 10 mg: 47.8; placebo: 48 | Phase II, multicentre, double blind comparative parallel group efficacy, safety, tolerance, outpatient and sleep laboratory trial | Outpatients | Sleep latency PSG |
| Trial 301, 1998, US (27) | 586 | 58.4% | 41.8 | Zaleplon 5 mg, 10 mg, and 20 mg; zolpidem 10 mg | Community | Intervention: zaleplon 5 mg: 81.5; zaleplon 10 mg: 77.7; zaleplon 20 mg: 72.5; zolpidem 10 mg: 70.5; placebo: 80.4 | Randomised placebo controlled parallel group multicentre double blind trial | Outpatients | Sleep latency PSG |
| Trial 307, 1998, US and Canada (39) | 637 | 60.6% | 43 | Zaleplon 10 mg/10 mg, 10 mg/20 mg | Community | Intervention: zaleplon 10 mg/10 mg: 79.8; zaleplon 10 mg/20 mg: 81.9; placebo: 77.93 | Randomised placebo controlled parallel group multicentre double blind trial | Outpatients | Sleep latency PSG |
| Trial 303, 1998, Europe and Canada | 574 | 64.4% | 42.8 | Zaleplon 5 mg, 10 mg, 20 mg; zolpidem 10 mg | Community | Intervention: zaleplon 5 mg: 66.0; zaleplon 10 mg: 57.0; zaleplon 20 mg: 55.0; zolpidem 10 mg: 64.0; placebo: 58.0 | Randomised placebo controlled parallel group multicentre double blind trial | NR | Sleep latency PSG |
| Trial 306, 1998, US | 422 | 64.4% | 72.5 | Zaleplon 5 mg and 10 mg | Community | Intervention: zaleplon 5 mg: 62.1; zaleplon 10 mg: 70.7; placebo: 68.0 | Prospective randomised double blind placebo controlled five arm parallel group multicentre trial | NR | Sleep latency PSG |




BMJ 2012;345:e8343 doi: 10.1136/bmj.e8343 (Published 17 December 2012) 






Table 1 (continued) 



| Study identifier, year, country (No of sites) | No | Women | Mean (SD) age (years) | Drug | Recruitment | Baseline PSG (subjective) sleep latency | Design | Patient type | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 190-049, 2003, US and Canada (69) | 791 | 63.2% | 44.1 | Eszopiclone 3 mg | NR | Intervention: NR; control: NR | Multicentre randomised trial | NR | Sleep latency subjective, total sleep time subjective, wake after sleep onset subjective |
| 190-047, 2003, USA (48), Canada (2) | 292 | 65.9% | 70.7 | Eszopiclone 2 mg | NR | Intervention: NR; control: NR | Multicentre randomised trial | NR | Sleep latency PSG, sleep efficiency PSG, wake after sleep onset PSG |
| 190-048, 2003, US and Canada (32) | 234 | 57.7% | 72.3 | Eszopiclone 1 mg and 2 mg | NR | Intervention: NR; control: NR | Multicentre randomised trial | NR | Sleep latency subjective, total sleep time subjective |

NR=not reported; MR=modified release; PSG=polysomnographic. 

Trial 307-1998 had two intervention arms: (i) zaleplon 10 mg for 14 days with outcomes measured at 7 days and 14 days compared with placebo and (ii) zaleplon 10 mg for 7 days followed by 20 mg for 7 days with outcomes measured at 7 days and 14 days compared with placebo; in both studies we used last measurement at 14 days and averaged dose at 15 mg as best approximation for study arm using 10 mg followed by 20 mg zaleplon. 

Table 2 Weighted standardised mean differences in effect of Z drugs (treatment) or placebo on insomnia 

Effect size columns give the weighted mean effect size (95% CI); homogeneity columns give I² (95% CI). 

| Outcome | No* (within groups) | Treatment (within groups) | Control (within groups) | No* (between groups) | Treatment v control (between groups) | I² treatment | I² control | I² treatment v control |
|---|---|---|---|---|---|---|---|---|
| Primary outcome: sleep latency | | | | | | | | |
| PSG | 16 | -0.93 (-1.32 to -0.54) | -0.39 (-0.54 to -0.23) | 22 | -0.36 (-0.57 to -0.16) | 89 (83.58 to 92.47) | 0 (0 to 77.60) | 41 (1.18 to 64.30) |
| Subjective | 4 | -0.67 (-1.30 to -0.03) | -0.33 (-0.63 to -0.03) | 11 | -0.33 (-0.62 to -0.04) | 0 (0 to 49.26) | 0 (0 to 66.15) | 83 (70.98 to 90.05) |
| Secondary outcomes | | | | | | | | |
| Wake after sleep onset (PSG) | 2 | -0.52 (-1.40 to 0.36) | -0.29 (-0.67 to -0.08) | 3 | -0.24 (-0.72 to 0.24) | 83 (29.36 to 95.94) | 50 (0 to 87.35) | 0 (0 to 93.84) |
| No of awakenings (PSG) | 2 | -0.36 (-1.28 to 0.56) | -0.21 (-0.60 to 0.17) | 4 | -0.33 (-0.80 to 0.14) | 91 (70 to 97.57) | 0 (0 to 99.67) | 65 (0 to 88.10) |
| No of awakenings (subjective) | 2 | -0.91 (-1.90 to 0.09) | -0.28 (-0.66 to 0.10) | 6 | -0.06 (-0.42 to 0.29) | 87 (47.31 to 96.63) | 36 (0 to 79.25) | 85 (69 to 92.67) |
| Total sleep time (PSG) | 2 | 1.06 (-1.37 to 3.49) | 0.65 (-0.67 to 1.98) | 2 | 0.41 (-0.51 to 1.32) | 74 (0 to 94.19) | 42 (0 to 83.14) | 0 |
| Sleep efficiency (PSG) | 2 | 0.52 (-1.23 to 2.28) | 0 (-0.59 to 0.59) | 5 | 0.59 (-0.12 to 1.29) | 54 (0 to 88.82) | 0 | 0 (0 to 75.02) |
| Total sleep time (subjective) | 0 | | | 7 | 0.45 (-0.08 to 0.98) | | | 0 (0 to 71.12) |
| Sleep quality (subjective) | 0 | | | 4 | 0.30 (-0.32 to 0.92) | | | 0 (0 to 71.13) |
| Wake after sleep onset (subjective) | 0 | | | 1 | -0.16 (-0.60 to 0.28) | | | |

PSG=polysomnographic. 
*No of comparisons. 
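
The homogeneity columns report I². The tables do not state how it was derived; a minimal sketch, assuming the conventional definition from Cochran's Q with inverse variance weights, is given below. The function names and the example inputs are hypothetical and are not taken from the trials.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted squared deviations from the inverse variance weighted mean."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, df):
    """I² as a percentage of total variation attributed to heterogeneity, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical effect sizes and sampling variances, for illustration only.
effects = [-0.50, -0.20, -0.35]
variances = [0.02, 0.03, 0.025]
q = cochran_q(effects, variances)
print(f"Q = {q:.2f}, I2 = {i_squared(q, len(effects) - 1):.0f}%")
```

With only two or three comparisons the estimate is imprecise, as the wide intervals in several rows of the table show.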
Table 3 Weighted mean raw differences in effect of Z drugs (treatment) or placebo on insomnia 

Difference columns give the weighted mean difference (95% CI); homogeneity columns give I² (95% CI). 

| Outcome | No* (within groups) | Treatment (within groups) | Control (within groups) | No* (between groups) | Treatment v control (between groups) | I² treatment | I² control | I² treatment v control |
|---|---|---|---|---|---|---|---|---|
| Primary outcome: sleep latency | | | | | | | | |
| PSG | 14 | -42 (-60 to -23) | -20 (-28 to -11) | 14 | -22 (-33.00 to -11.00) | 96 (94.75 to 97.15) | 41 (0 to 68.68) | 94 (91.66 to 95.83) |
| Subjective | 2 | -24.99 (-30.06 to -19.92) | -19.43 (-26.61 to -12.25) | 2 | -6.90 (-26.00 to 12.37) | 0 (0 to 100) | 0 (0 to 100) | 27 (0 to 72.41) |
| Secondary outcomes | | | | | | | | |
| Wake after sleep onset (PSG) | 2 | -20 (-59 to 18) | -13 (-34 to 7.89) | 2 | -7.14 (-33.00 to 18.23) | 65 (0 to 91.96) | 63 (0 to 91.62) | 0 (0 to 99.98) |
| No of awakenings (PSG) | 2 | -1.24 (-6.34 to 3.89) | -0.94 (-12 to 9.99) | 2 | -0.47 (-5.12 to 4.17) | 94 (81.24 to 98.13) | 0 (0 to 99.98) | 0 (0 to 99.90) |
| No of awakenings (subjective) | 2 | -2.88 (-7.15 to 1.39) | -1.05 (-4.86 to 2.76) | 2 | -1.77 (-4.61 to 1.07) | 0 (0 to 100) | 0 (0 to 99.95) | 0 (0 to 99.81) |
| Total sleep time (PSG) | 2 | 49.15 (-60 to 16) | 35.10 (-34 to 10) | 2 | 14.05 (-31.00 to 58.72) | 63 (0 to 91.45) | 61 (0 to 91.07) | 0 (0 to 99.68) |
| Sleep efficiency (PSG) | 1 | 4.27 (2.01 to 6.52) | 0 (-2.52 to 2.52) | 1 | 4.47 (2.08 to 6.86) | | | |
| Total sleep time (subjective) | 0 | | | 0 | | | | |
| Sleep quality (subjective) | 0 | | | 0 | | | | |
| Wake after sleep onset (subjective) | 0 | | | 0 | | | | |

PSG=polysomnographic. 
*No of comparisons. 
Table 4 Moderator analysis for effect of Z drugs (treatment) or placebo on insomnia. Effect sizes are weighted standardised mean differences 

| Moderator variable† | Effect size (95% CI) | β‡ |
|---|---|---|
| Polysomnographic sleep latency | | |
| Dose (22 comparisons): | | -0.22* |
| 1 mg | -0.24 (-0.38 to -0.11) | |
| 20 mg | -0.50 (-0.64 to -0.35) | |
| Subjective sleep latency | | |
| Year of data collection (9 comparisons): | | 0.63*** |
| 1988 | -0.88 (-1.19 to -0.58) | |
| 2004 | -0.03 (-0.16 to 0.10) | |
| Age (9 comparisons): | | 0.89*** |
| 38 years | -0.65 (-0.82 to -0.48) | |
| 72 years | 0.31 (0.13 to 0.50) | |
| Percentage of women (9 comparisons): | | -0.40** |
| 55.9% | 0.01 (-0.17 to 0.18) | |
| 67.5% | -0.67 (-0.99 to -0.34) | |
| Type of drug (11 comparisons): | | -0.57** |
| Eszopiclone (3 comparisons) | 0.02 (-0.13 to 0.18) | |
| Zolpidem (8 comparisons) | -0.47 (-0.61 to -0.33) | |
| Dose (11 comparisons): | | -0.70*** |
| 1 mg | 0.13 (-0.37 to 0.30) | |
| 20 mg | -1.01 (-1.31 to -0.70) | |

*P<0.05; **P<0.01; ***P<0.001. 

†Effect size of each outcome was entered as dependent variable into separate weighted least squares regressions under fixed effects assumptions for each moderator variable independently; negative ds imply lower outcome at final available measures; estimates appear for observed extremes of continuous features. 
‡Standardised regression coefficient. 
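
The moderator footnote describes weighted least squares regression of effect size on one moderator at a time under fixed effects assumptions, with estimates reported at the observed extremes of the moderator. A minimal single predictor sketch follows; it assumes inverse variance weights, `wls_fit` is a hypothetical helper, and the doses, effect sizes, and variances are made-up numbers for illustration, not trial data.

```python
import math


def wls_fit(x, y, weights):
    """Weighted least squares fit of y on a single moderator x.

    Returns (intercept, slope, standardised_beta).
    """
    total = sum(weights)
    x_mean = sum(w * xi for w, xi in zip(weights, x)) / total
    y_mean = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - x_mean) ** 2 for w, xi in zip(weights, x))
    syy = sum(w * (yi - y_mean) ** 2 for w, yi in zip(weights, y))
    sxy = sum(w * (xi - x_mean) * (yi - y_mean)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # With one predictor the standardised slope equals the weighted correlation.
    beta = sxy / math.sqrt(sxx * syy)
    return intercept, slope, beta


# Hypothetical comparisons: dose (mg), standardised mean difference, sampling variance.
dose = [1.0, 5.0, 10.0, 20.0]
effect = [-0.20, -0.30, -0.45, -0.50]
variance = [0.010, 0.020, 0.015, 0.020]
weights = [1.0 / v for v in variance]  # fixed effects: inverse variance weights

intercept, slope, beta = wls_fit(dose, effect, weights)
for d in (min(dose), max(dose)):  # predictions at the observed extremes
    print(f"dose {d:g} mg: predicted effect size {intercept + slope * d:.2f}")
print(f"standardised coefficient: {beta:.2f}")
```

Fitting each moderator separately, as the footnote describes, keeps every model to one predictor, which suits the small number of comparisons available for each outcome.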
Figures 

Records identified through searching FDA database (n=46 trials) 
Records after duplicates removed (n=46 trials) 
Records screened (n=46 trials) 
Records excluded: summary or insufficient or no data (n=10 trials) 
Full text articles assessed for eligibility (n=36 trials) 
Full text articles excluded: 
Crossover design (n=18 trials) 
No placebo control group (n=7 trials) 
Healthy patients with normal sleep pattern (n=10 trials) 
Single night studies (induced insomnia) (n=3 trials) 
Low quality studies (n=0 trials) 
Not enough statistical information to obtain an effect size (n=2 trials) 
(some studies had more than one reason for exclusion) 

Fig 1 Identification of studies from FDA databases and inclusion in study 

Fig 2 Forest plot for polysomnographic sleep latency under random effects assumptions 